The Improved Job Scheduling Algorithm of Hadoop Platform

Authors

  • Yingjie Guo
  • Linzhi Wu
  • Wei Yu
  • Bin Wu
  • Xiaotian Wang
Abstract

[Abstract] This paper discusses several job scheduling algorithms for the Hadoop platform and, in view of the shortcomings of those algorithms, proposes a job scheduling optimization algorithm based on Bayes classification. The proposed algorithm can be summarized as follows. Jobs in the job queue are classified as good jobs or bad jobs by Bayes classification. When the JobTracker receives a task request, it selects a good job from the job queue and allocates tasks from that job to the requesting TaskTracker; the execution result is then fed back to the JobTracker. By learning from this feedback, the algorithm adjusts the job classification so that the JobTracker selects the most appropriate job to execute on a TaskTracker each time. Two kinds of feature variables are considered: job features, which describe a job's resource usage (for instance, the average CPU usage rate and the average memory usage rate), and node features, which describe the influence of TaskTracker resources on task execution (such as the CPU usage rate and the size of idle physical memory). Results show a significant improvement in the execution efficiency and stability of job scheduling. [Key Words]: Hadoop; scheduling algorithm; improvement; Bayes
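The classify, select, and feedback loop described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class `BayesJobClassifier`, the function `select_job`, the feature names (`job_cpu`, `job_mem`, `node_cpu`, `node_free_mem`), the three-level discretization, the Laplace smoothing, and the 0.5 threshold are all assumptions made for illustration. The sketch treats a job as "good" when its past executions on similar nodes did not overload the TaskTracker.

```python
from collections import defaultdict


class BayesJobClassifier:
    """Naive Bayes over discretized feature variables (hypothetical sketch)."""

    LABELS = ("good", "bad")

    def __init__(self, levels=3):
        # Assumption: every feature is discretized into `levels` buckets (0..levels-1).
        self.levels = levels
        self.label_counts = {label: 0 for label in self.LABELS}
        # feature_counts[label][(feature_name, value)] -> number of observations
        self.feature_counts = {label: defaultdict(int) for label in self.LABELS}

    def posterior_good(self, features):
        """Return P(good | features) using Laplace-smoothed estimates."""
        total = sum(self.label_counts.values())
        scores = {}
        for label in self.LABELS:
            # Smoothed prior P(label)
            score = (self.label_counts[label] + 1) / (total + len(self.LABELS))
            for name, value in features.items():
                count = self.feature_counts[label][(name, value)]
                # Smoothed likelihood P(feature = value | label)
                score *= (count + 1) / (self.label_counts[label] + self.levels)
            scores[label] = score
        return scores["good"] / (scores["good"] + scores["bad"])

    def feedback(self, features, overloaded):
        """Learn from an execution result reported back to the scheduler."""
        label = "bad" if overloaded else "good"
        self.label_counts[label] += 1
        for name, value in features.items():
            self.feature_counts[label][(name, value)] += 1


def select_job(classifier, job_queue, node_features, threshold=0.5):
    """Pick the queued job most likely to be 'good' on the requesting node.

    Returns None when no job reaches the threshold; ties keep queue order.
    """
    best_job, best_score = None, None
    for job in job_queue:
        # Feature variables = job features + node features
        features = {**job["features"], **node_features}
        score = classifier.posterior_good(features)
        if score >= threshold and (best_job is None or score > best_score):
            best_job, best_score = job, score
    return best_job


# Usage: a busy node requests a task; feedback steers later selections.
classifier = BayesJobClassifier()
busy_node = {"node_cpu": 2, "node_free_mem": 0}
heavy_job = {"name": "heavy", "features": {"job_cpu": 2, "job_mem": 2}}
light_job = {"name": "light", "features": {"job_cpu": 0, "job_mem": 0}}

classifier.feedback({**heavy_job["features"], **busy_node}, overloaded=True)
classifier.feedback({**light_job["features"], **busy_node}, overloaded=False)
chosen = select_job(classifier, [heavy_job, light_job], busy_node)
print(chosen["name"])
```

Combining job features and node features into one feature vector reflects the abstract's point that both the job's resource usage and the TaskTracker's available resources influence task execution; how the paper actually encodes and weights these variables is not stated in the abstract.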

Similar resources

A Comparative Analysis of MapReduce Scheduling Algorithms for Hadoop

Today’s digital era is causing an escalation of datasets. These datasets are termed “Big Data” because of their massive volume, variety, and velocity, and are stored in a distributed file system architecture. Hadoop is a framework that supports the Hadoop Distributed File System (HDFS) for storing and MapReduce for processing large data sets in a distributed computing environment. Task assignment is pos...

A Hadoop Job Scheduling Algorithm Based on Pagerank

Aiming at the problem that the job scheduling algorithm based on the classical model of cloud computing in Hadoop is not high, a new job scheduling algorithm based on the PageRank algorithm is proposed. Under the premise of ensuring the user experience, we propose a new job scheduling algorithm named ValidRank, which is based on the combination of hierarchical weight and waiting time. Then for th...

A Task Scheduling Algorithm for Hadoop Platform

MapReduce is a software framework for easily writing applications that process vast amounts of data on large clusters of commodity hardware. To achieve better task allocation and load balancing, this paper analyzes the MapReduce work mode and the task scheduling algorithm of the Hadoop platform. According to this situation that the number of tasks of the smaller weight job is mo...

Improved Fair Scheduling Algorithm for Hadoop Clustering (Sneha and Shoney Sebastian)

The traditional way of storing such a huge amount of data is not convenient, because processing that data in later stages is a very tedious job. So nowadays Hadoop is used to store and process large amounts of data. Statistics on the data generated in recent years show that the volume has been very high in the last 2 years. Hadoop is a good framework for storing and processing data efficiently. It works li...

Job Attentive Scheduling Algorithm in Hadoop

In recent years cloud services have gained much attention as a result of their availability, scalability, and low cost. One use of these services has been for the execution of scientific workflows as part of Big Data Analytics, which are employed in a diverse range of fields including astronomy, physics, seismology, and bioinformatics. There has been much research on heuristic scheduling algori...

Journal title:
  • CoRR

Volume abs/1506.03004

Publication date 2015